Make analytics-engine an optional dependency by bowenlan-amzn · Pull Request #5403 · opensearch-project/sql

bowenlan-amzn · 2026-05-04T04:47:02Z

Summary

Makes the analytics-engine plugin an optional dependency of opensearch-sql. Installing opensearch-sql on a distribution without analytics-engine now succeeds; all PPL/SQL queries route through the in-plugin v2 Calcite engine (Lucene-backed). When analytics-engine is installed, the existing Mustang path (parquet-backed indices → QueryPlanExecutor) continues to work.

Approach

extendedPlugins = ['opensearch-job-scheduler', 'analytics-engine;optional=true']
Move QueryPlanExecutor from a constructor parameter to an @Inject(optional = true) setter. Guice invokes the setter only when the binding exists; absent analytics-engine it's silently skipped and unifiedQueryHandler stays null. PPL routing then falls through to the v2 Calcite-to-OpenSearch path unconditionally.
Promote analytics-framework.jar from compileOnly to api so QueryPlanExecutor/SchemaProvider resolve from sql's own classpath when analytics-engine is absent. analytics-engine.jar stays compileOnly — its classes (e.g. OpenSearchSchemaBuilder) are reached only through RestUnifiedQueryAction, which never loads when the setter is skipped.
Drop the bundlePlugin { exclude '*.jar' } block. Required so sql's own classpath is self-sufficient when analytics-engine is absent (guava, calcite, jackson, etc. must resolve from sql's own jars). See "Why the exclude block has to go" below for the full tradeoff.

Addressing the diff analyzer findings

All three flagged items are intentional and follow documented OpenSearch plugin mechanisms:

;optional=true syntax — parsed in PluginInfo.java:229 ("optional=true".equals(dependency[1])). Used in production elsewhere in the OpenSearch plugin ecosystem. Not ad-hoc syntax.
api promotion for analytics-framework — required so that sql's own classloader can resolve QueryPlanExecutor/SchemaProvider when analytics-engine is absent. When it IS present, the class-identity question is resolved by parent-first delegation: analytics-engine supplies the parent classloader (via extendedPlugins), and its copy of every framework class wins. sql's bundled jar is dead code in that scenario.
Removed exclude block — see below. The short answer: required for standalone install, accepted by OpenSearch for optional deps (PluginsService.java:763 skips the jar-hell cross-check), runtime collisions prevented by parent-first classloader delegation. Rests on a version-consistency assumption documented below.

Why the exclude block has to go

The excluded jars (guava, calcite, jackson, commons-*, etc.) are transitive deps sql needs to run. Previously sql didn't bundle them because analytics-engine was a required dep and shipped them — the shared classloader supplied sql at runtime. With ;optional=true sql can't rely on that parent being present, so it has to carry its own copies or it won't load at all when analytics-engine is absent.

How OpenSearch accepts the duplicates when both plugins are installed:

PluginsService.checkBundleJarHell (line 763) skips the cross-plugin jar-hell check for extended plugins marked ;optional=true. OpenSearch explicitly supports this shape.
At runtime, parent-first classloader delegation routes every class lookup to analytics-engine's copy first. sql's bundled copies are never touched when the parent is there.

The assumption this rests on: sql and analytics-engine keep the same transitive versions. Today both inherit from the OpenSearch root versions.* catalog, so guava, calcite-core, etc. match by construction. If they ever drift, parent-first would silently pick analytics-engine's version — if sql compiled against a newer API, it would break at call time. A future CI check comparing transitive sets would catch drift early.

Alternatives considered: keeping the exclude block conditional on build-time knowledge of deployment shape isn't viable — Gradle doesn't know whether the target cluster has analytics-engine. Shipping two zips (standalone vs with-analytics) isn't supported by the OpenSearch plugin distribution model. Removing the block is the cleanest option that actually delivers standalone install.

End-to-end test procedure

Verified on a 2-node sandbox cluster (OpenSearch main + analytics-engine from #21501).

Prerequisites

# 1. Build the native Rust library (needed for sandbox native-bridge module)
cd ~/code/opensearch/OpenSearch   # or your OpenSearch checkout
./gradlew :sandbox:libs:dataformat-native:buildRustLibrary -Dsandbox.enabled=true

# 2. Build and publish SQL plugin to mavenLocal
cd ~/code/opensearch/sql          # this branch
./gradlew :opensearch-sql-plugin:publishToMavenLocal

Start cluster

cd ~/code/opensearch/OpenSearch
./gradlew run -PnumNodes=2 -Dsandbox.enabled=true \
  -PinstalledPlugins="['opensearch-job-scheduler:3.7.0.0-SNAPSHOT', \
    'arrow-flight-rpc', 'analytics-engine', 'parquet-data-format', \
    'analytics-backend-datafusion', 'analytics-backend-lucene', \
    'composite-engine', 'opensearch-sql-plugin:3.7.0.0-SNAPSHOT']" \
  -Dtests.jvm.argline="-Dopensearch.experimental.feature.pluggable.dataformat.enabled=true \
    -Dopensearch.experimental.feature.transport.stream.enabled=true"

Verify Calcite/Janino path (Lucene index)

These UDFs trigger SqlFunctions.<clinit> which loads LevenshteinDistance (commons-text) via Janino code-gen — the classloader path that requires both plugins to bundle their own deps correctly.

# Create a regular Lucene index
curl -s -X PUT localhost:9200/t -H 'Content-Type: application/json' -d '{
  "settings": {"number_of_shards":1, "number_of_replicas":0},
  "mappings": {"properties": {"name": {"type":"keyword"}, "value": {"type":"integer"}}}
}'
curl -s -X POST "localhost:9200/t/_bulk?refresh=true" -H 'Content-Type: application/x-ndjson' -d '
{"index":{}}
{"name":"item1","value":10}
{"index":{}}
{"name":"item2","value":20}
'

# Queries that exercise Janino/SqlFunctions
curl -s -XPOST 'localhost:9200/_plugins/_ppl' -H 'Content-Type: application/json' \
  -d '{"query": "source=t | eval h = md5(name) | fields name, h"}'
# Expected: md5 hashes (cabf67b..., 235bac...)

curl -s -XPOST 'localhost:9200/_plugins/_ppl' -H 'Content-Type: application/json' \
  -d '{"query": "source=t | eval u = upper(name) | fields name, u"}'
# Expected: ITEM1, ITEM2

Verify Mustang path (parquet index)

curl -s -X PUT localhost:9200/parquet_test -H 'Content-Type: application/json' -d '{
  "settings": {
    "number_of_shards":2, "number_of_replicas":0,
    "index.pluggable.dataformat.enabled": true,
    "index.pluggable.dataformat": "composite",
    "index.composite.primary_data_format": "parquet",
    "index.composite.secondary_data_formats": []
  },
  "mappings": {"properties": {"value": {"type":"integer"}}}
}'

for i in $(seq 0 19); do printf '{"index":{"_id":"%s"}}\n{"value":7}\n' $i; done | \
  curl -s -X POST "localhost:9200/parquet_test/_bulk?refresh=true" \
    -H 'Content-Type: application/x-ndjson' --data-binary @-
curl -s -X POST "localhost:9200/parquet_test/_flush"

curl -s -XPOST 'localhost:9200/_plugins/_ppl' -H 'Content-Type: application/json' \
  -d '{"query": "source=parquet_test | stats sum(value) as total"}'
# Expected: total = 140

Results

Query	Path	Result
`md5(name)`	Calcite → Janino → SqlFunctions	`cabf67b...`
`upper(name)`	Calcite → Janino → SqlFunctions	`ITEM1`
`stats sum(value)` on parquet	Mustang → DataFusion	`140`

All pass — SQL plugin bundles its own calcite/commons-text/guava and the classloader resolves correctly in both configurations.

Not in this PR

The routing heuristic (table name prefix parquet_) is unchanged. A proper setting-based or data-format-based trigger is a separate discussion.
No changes on the analytics-engine side or in core OpenSearch. The mechanism uses only existing platform features.
No CI guard for transitive version consistency between sql and analytics-engine. Worth adding as a follow-up; today it's a manual convention.

Check List

New functionality includes testing
New functionality has been documented
Commits are signed off (DCO)

github-actions · 2026-05-04T04:48:08Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit e271ff6.

Path	Line	Severity	Description
plugin/build.gradle	164	high	Dependency scope changed from 'compileOnly' to 'api' for analytics-framework-3.7.0-SNAPSHOT.jar. This promotes a previously compile-only local artifact to a transitive API dependency, altering the plugin's dependency graph and class loading at runtime. Mandatory flag: dependency configuration change.
plugin/build.gradle	58	high	extendedPlugins declaration changed from 'analytics-engine' to 'analytics-engine;optional=true'. This modifies how the plugin loader resolves and links the analytics-engine plugin at startup, potentially altering the shared classloader chain. Mandatory flag: build plugin/dependency configuration change.
plugin/build.gradle	60	medium	The entire bundlePlugin exclusion block (20+ jar exclusions preventing jar hell with analytics-engine's classloader) has been removed. Without these exclusions, the SQL plugin bundle may now include duplicate classes previously provided by analytics-engine, risking ClassLoader conflicts or silently shadowing classes at runtime. This warrants verification that the optional-dependency refactoring fully accounts for the prior deduplication logic.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 2 | Medium: 1 | Low: 0

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@Inject

sql plugin previously declared analytics-engine as a hard dependency: extendedPlugins = ['opensearch-job-scheduler', 'analytics-engine'] Installing opensearch-sql on a distribution that doesn't ship analytics-engine failed with: Missing plugin [analytics-engine], dependency of [opensearch-sql] Marking the dep ;optional=true alone is not enough — TransportPPLQueryAction Guice-injects QueryPlanExecutor on its constructor, and Guice's OpenSearch fork rejects a required constructor parameter whose binding is missing at injector-build time ("constructors cannot be optional"). Move QueryPlanExecutor from a required constructor parameter to an @Inject(optional=true) setter. Guice invokes the setter only when a binding for QueryPlanExecutor<RelNode, Iterable<Object[]>> exists — i.e. when analytics-engine's createGuiceModules has run and bound DefaultPlanExecutor. Absent analytics-engine, the setter is silently skipped, unifiedQueryHandler stays null, and all PPL queries route to the v2 Calcite-to-OpenSearch path already in the sql plugin. Drop the bundlePlugin exclude list. OpenSearch's jar-hell check skips the extended-plugin cross-check when the dep is marked optional (PluginsService.java:763), so sql can bundle every jar it needs to run self-sufficiently. When analytics-engine is installed, parent-first classloader delegation still lets analytics-engine's copies win for any shared class; sql's bundled copies sit idle. Promote analytics-framework.jar from compileOnly to api so QueryPlanExecutor is reachable from sql's own classloader when the plugin is absent. analytics-engine.jar stays compileOnly (required only for OpenSearchSchemaBuilder, which lives in the engine plugin and is reached only through RestUnifiedQueryAction — a class that never loads when the setter is skipped). Validated on a live 2-node cluster in both configurations: - With analytics-engine installed: legacy and analytics PPL both return expected rows; routing to the analytics path still fires for parquet_-prefixed indices. - Without analytics-engine (only opensearch-job-scheduler + opensearch-sql installed): cluster starts cleanly, PPL and SQL queries against Lucene indices return expected rows, parquet_-prefixed lookups return a clean IndexNotFoundException instead of a NullPointerException or NoClassDefFoundError. Signed-off-by: bowenlan-amzn <[email protected]>

mch2 · 2026-05-04T17:02:10Z

+   * Absent analytics-engine, Guice silently skips the call; {@link #unifiedQueryHandler}
+   * stays null and all PPL routing falls through to the legacy engine.
+   */
+  @Inject(optional = true)


does this work?

bowenlan-amzn marked this pull request as ready for review May 4, 2026 05:23

bowenlan-amzn requested review from LantaoJin, RyanL1997, Swiddis, acarbonetto, ahkcs, anirudha, dai-chen, joshuali925, mengweieric, noCharger, penghuo, ps48, qianheng-aws, songkant-aws, vamsimanohar, ykmr1224 and yuancu as code owners May 4, 2026 05:23

mch2 reviewed May 4, 2026

View reviewed changes

bowenlan-amzn force-pushed the analytics-optional branch from 280d3b4 to e271ff6 Compare May 4, 2026 17:42

bowenlan-amzn added the enhancement New feature or request label May 4, 2026

ahkcs approved these changes May 4, 2026

View reviewed changes

ahkcs mentioned this pull request May 6, 2026

Exclude jsr305 from SQL bundle to fix jar hell with arrow-flight-rpc #5409

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make analytics-engine an optional dependency #5403

Make analytics-engine an optional dependency #5403
bowenlan-amzn wants to merge 1 commit intoopensearch-project:feature/mustang-ppl-integrationfrom
bowenlan-amzn:analytics-optional

bowenlan-amzn commented May 4, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 4, 2026 •

edited

Loading

Uh oh!

mch2 May 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

bowenlan-amzn commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Approach

Addressing the diff analyzer findings

Why the exclude block has to go

End-to-end test procedure

Prerequisites

Start cluster

Verify Calcite/Janino path (Lucene index)

Verify Mustang path (parquet index)

Results

Not in this PR

Check List

Uh oh!

github-actions Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Analyzer ❗

Uh oh!

mch2 May 4, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

bowenlan-amzn commented May 4, 2026 •

edited

Loading

github-actions Bot commented May 4, 2026 •

edited

Loading